 coliee 2023


Enhancing Legal Document Retrieval: A Multi-Phase Approach with Large Language Models

Nguyen, Hai-Long, Nguyen, Duc-Minh, Nguyen, Tan-Minh, Nguyen, Ha-Thanh, Vuong, Thi-Hai-Yen, Satoh, Ken

arXiv.org Artificial Intelligence

GPT-4, and LLaMA, are increasingly prevalent. Numerous studies have explored effective prompting techniques to harness the power of these LLMs for various research problems. Retrieval, specifically in the legal data domain, poses a challenging task for the direct application of prompting techniques due to the large number and substantial length of legal articles. This research focuses on maximizing the potential of prompting by placing it as the final phase of the retrieval system, preceded by two supporting phases: BM25 pre-ranking and BERT-based re-ranking. Experiments on the COLIEE 2023 dataset demonstrate that integrating prompting techniques on LLMs into the retrieval system significantly improves retrieval accuracy. However, error analysis reveals several issues in the retrieval system that still need to be resolved.
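
To illustrate the first of the three phases named in the abstract above, here is a minimal sketch of BM25 pre-ranking in Python. It is not the authors' implementation: the whitespace tokenizer, the function name `bm25_rank`, the parameter defaults (`k1=1.5`, `b=0.75`), and the sample articles are all assumptions chosen for illustration. The later BERT-based re-ranking and LLM prompting phases would operate on the shortlist this step returns and are not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


def bm25_rank(query, documents, k1=1.5, b=0.75, top_k=2):
    """Return indices of the top_k documents by Okapi BM25 score.

    Assumes `documents` is a non-empty list of strings.
    """
    docs = [tokenize(d) for d in documents]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scored = []
    for idx, doc in enumerate(docs):
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scored.append((score, idx))

    # Highest score first; ties broken by original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:top_k]]


# Hypothetical toy articles, not taken from the COLIEE dataset.
articles = [
    "a person who damages property must compensate the owner",
    "a contract is formed by offer and acceptance",
    "the owner may demand return of property",
]
shortlist = bm25_rank("compensate property damages", articles, top_k=2)
```

In a multi-phase system of the kind described, a cheap lexical step like this narrows a large collection to a small candidate set, so that the more expensive re-ranking and prompting stages only see a handful of articles.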


CAPTAIN at COLIEE 2023: Efficient Methods for Legal Information Retrieval and Entailment Tasks

Nguyen, Chau, Nguyen, Phuong, Tran, Thanh, Nguyen, Dat, Trieu, An, Pham, Tin, Dang, Anh, Nguyen, Le-Minh

arXiv.org Artificial Intelligence

The Competition on Legal Information Extraction/Entailment (COLIEE) is held annually to encourage advancements in the automatic processing of legal texts. Processing legal documents is challenging due to the intricate structure and meaning of legal language. In this paper, we outline our strategies for tackling Task 2, Task 3, and Task 4 in the COLIEE 2023 competition. Our approach involved utilizing appropriate state-of-the-art deep learning methods, designing methods based on domain characteristics observation, and applying meticulous engineering practices and methodologies to the competition. As a result, our performance in these tasks has been outstanding, with first places in Task 2 and Task 3, and promising results in Task 4. Our source code is available at https://github.com/Nguyen2015/CAPTAIN-COLIEE2023/tree/coliee2023.


NOWJ at COLIEE 2023 -- Multi-Task and Ensemble Approaches in Legal Information Processing

Vuong, Thi-Hai-Yen, Nguyen, Hai-Long, Nguyen, Tan-Minh, Nguyen, Hoang-Trung, Nguyen, Thai-Binh, Nguyen, Ha-Thanh

arXiv.org Artificial Intelligence

This paper presents the NOWJ team's approach to the COLIEE 2023 Competition, which focuses on advancing legal information processing techniques and applying them to real-world legal scenarios. Our team tackles the four tasks in the competition, which involve legal case retrieval, legal case entailment, statute law retrieval, and legal textual entailment. We employ state-of-the-art machine learning models and innovative approaches, such as BERT, Longformer, BM25-ranking algorithm, and multi-task learning models. Although our team did not achieve state-of-the-art results, our findings provide valuable insights and pave the way for future improvements in legal information processing.


THUIR@COLIEE 2023: More Parameters and Legal Knowledge for Legal Case Entailment

Li, Haitao, Wang, Changyue, Su, Weihang, Wu, Yueyue, Ai, Qingyao, Liu, Yiqun

arXiv.org Artificial Intelligence

This paper describes the approach of the THUIR team at the COLIEE 2023 Legal Case Entailment task. This task requires the participant to identify a specific paragraph from a given supporting case that entails the decision for the query case. We try traditional lexical matching methods and pre-trained language models of different sizes. Furthermore, learning-to-rank methods are employed to further improve performance. However, learning-to-rank is not very robust on this task, which suggests that answer passages cannot simply be determined with information retrieval techniques. Experimental results show that more parameters and legal knowledge contribute to the legal case entailment task. Finally, we placed third in COLIEE 2023. The implementation of our method can be found at https://github.com/CSHaitao/THUIR-COLIEE2023.


THUIR@COLIEE 2023: Incorporating Structural Knowledge into Pre-trained Language Models for Legal Case Retrieval

Li, Haitao, Su, Weihang, Wang, Changyue, Wu, Yueyue, Ai, Qingyao, Liu, Yiqun

arXiv.org Artificial Intelligence

Legal case retrieval techniques play an essential role in modern intelligent legal systems. A well-known annual international competition, COLIEE aims to advance state-of-the-art retrieval models for legal texts. This paper summarizes the approach of the champion team THUIR in COLIEE 2023. Specifically, we design structure-aware pre-trained language models to enhance the understanding of legal cases. Furthermore, we propose heuristic pre-processing and post-processing approaches to reduce the influence of irrelevant information. Finally, learning-to-rank methods are employed to merge features with different dimensions. Experimental results demonstrate the superiority of our proposal. Official results show that our run has the best performance among all submissions. The implementation of our method can be found at https://github.com/CSHaitao/THUIR-COLIEE2023.